Discriminative Features for Language Identification
نویسندگان
چکیده
In this paper we investigate the use of discriminatively trained feature transforms to improve the accuracy of a MAP-SVM language recognition system. We train the feature transforms by alternatively solving an SVM optimization on MAP supervectors estimated from transformed features, and performing a small step on the transforms in the direction of the antigradient of the SVM objective function. We applied this method on the LRE2003 dataset, and obtained an 5.9% relative reduction of pooled equal error rate.
منابع مشابه
A New Phono-Articulatory Feature Representation for Language Identification in a Discriminative Framework
State of the Art language identification methods are based on acoustic or phonetic features. Recently, phono-articulatory features have been included as a new speech characteristic that conveys language information. Authors propose a new phono-articulatory representation of speech in a discriminative framework to identify languages. This simple representation shows good results discriminating b...
متن کاملA language model based approach towards large scale and lightweight language identification systems
Multilingual spoken dialogue systems have gained prominence in the recent past necessitating the requirement for a front-end Language Identification (LID) system. Most of the existing LID systems rely on modeling the language discriminative information from low-level acoustic features. Due to the variabilities of speech (speaker and emotional variabilities, etc.), large-scale LID systems develo...
متن کاملTask-aware deep bottleneck features for spoken language identification
Recently, deep bottleneck features (DBF) extracted from a deep neural network (DNN) containing a narrow bottleneck layer, have been applied for language identification (LID), and yield significant performance improvement over state-of-the-art methods on NIST LRE 2009. However, the DNN is trained using a large corpus of specific language which is not directly related to the LID task. More recent...
متن کاملOffline Language-free Writer Identification based on Speeded-up Robust Features
This article proposes offline language-free writer identification based on speeded-up robust features (SURF), goes through training, enrollment, and identification stages. In all stages, an isotropic Box filter is first used to segment the handwritten text image into word regions (WRs). Then, the SURF descriptors (SUDs) of word region and the corresponding scales and orientations (SOs) are extr...
متن کاملAlignment-Based Discriminative String Similarity
A character-based measure of similarity is an important component of many natural language processing systems, including approaches to transliteration, coreference, word alignment, spelling correction, and the identification of cognates in related vocabularies. We propose an alignment-based discriminative framework for string similarity. We gather features from substring pairs consistent with a...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2011